viz

Visualization routines

Originally written for https://github.com/zqevans/audio-diffusion/blob/main/viz/viz.py


source

embeddings_table

 embeddings_table (tokens)

make a table of embeddings for use with wandb
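The body of embeddings_table isn't shown here, but the gist is flattening a (b, d, n) token tensor into one row per token position so wandb.Table can ingest it. A minimal sketch of that flattening step (embeddings_rows is a hypothetical helper, not part of this library):

```python
import numpy as np

def embeddings_rows(tokens):
    """Flatten (b, d, n) embeddings into one row per token position,
    tagging each row with its batch index -- the layout wandb.Table expects."""
    b, d, n = tokens.shape
    rows = []
    for i in range(b):
        for j in range(n):
            rows.append([i] + tokens[i, :, j].tolist())
    cols = ['batch'] + [f'dim_{k}' for k in range(d)]
    return cols, rows
```

The returned `cols` and `rows` would then be passed as `wandb.Table(columns=cols, data=rows)`.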


source

proj_pca

 proj_pca (tokens, proj_dims=3)

projects the tokens via PCA, keeping the first proj_dims dimensions (default 3)
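For intuition, here is a minimal numpy sketch of the same projection (the library likely operates on PyTorch tensors; this stand-alone version treats the b*n token vectors as points in d dimensions):

```python
import numpy as np

def proj_pca_np(tokens, proj_dims=3):
    """Project (b, d, n) token embeddings onto their first proj_dims
    principal components, returning an array of shape (b*n, proj_dims)."""
    b, d, n = tokens.shape
    points = tokens.transpose(0, 2, 1).reshape(-1, d)   # (b*n, d) points
    centered = points - points.mean(axis=0)
    # rows of vt are the principal directions, ordered by singular value
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:proj_dims].T
```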

3D Scatter plots

To visualize point clouds in notebook and on WandB:

source

pca_point_cloud

 pca_point_cloud (tokens, color_scheme='batch', output_type='wandbobj',
                  mode='markers', size=3, line={'color':
                  'rgba(10,10,10,0.01)'})

returns a 3D point cloud of the tokens using PCA

|  | Type | Default | Details |
|--|------|---------|---------|
| tokens |  |  | embeddings / latent vectors, shape = (b, d, n) |
| color_scheme | str | batch | 'batch': group by sample, otherwise color sequentially |
| output_type | str | wandbobj | 'plotly', 'points', or 'wandbobj'. NOTE: WandB can accept 'plotly' objects directly! |
| mode | str | markers | plotly scatter mode: 'lines+markers' or 'markers' |
| size | int | 3 | size of the dots |
| line | dict | {'color': 'rgba(10,10,10,0.01)'} | plotly line specifier, used when mode='lines+markers'; cf. https://plotly.github.io/plotly.py-docs/generated/plotly.graph_objects.scatter3d.html#plotly.graph_objects.scatter3d.Line |
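Under the hood, the 'plotly' output presumably assembles a scatter3d trace from the PCA-projected points. A sketch of that assembly (scatter3d_trace is a hypothetical helper; plotly figures also accept plain dicts with the same keys as graph_objects, so no plotly import is needed here):

```python
import numpy as np

def scatter3d_trace(points, labels=None, mode='markers', size=3):
    """Build a plotly-compatible scatter3d trace as a plain dict from
    (num_points, 3) projected coordinates."""
    trace = {
        'type': 'scatter3d',
        'x': points[:, 0].tolist(),
        'y': points[:, 1].tolist(),
        'z': points[:, 2].tolist(),
        'mode': mode,
        'marker': {'size': size},
    }
    if labels is not None:  # e.g. batch index per point, for color_scheme='batch'
        trace['marker']['color'] = list(labels)
    return trace
```

Such a dict can be handed to `plotly.graph_objects.Figure(data=[trace])`, or logged to WandB, which renders plotly figures directly.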

To display in the notebook (and in the online documentation), we need a bit of extra code:


source

show_pca_point_cloud

 show_pca_point_cloud (tokens, color_scheme='batch', mode='markers',
                       line={'color': 'rgba(10,10,10,0.01)'})

display a 3d scatter plot of tokens in notebook


source

setup_plotly

 setup_plotly (nbdev=True)

Plotly is already set up on Colab, but on regular Jupyter notebooks we need to do a couple of things


source

on_colab

 on_colab ()

Returns true if code is being executed on Colab, false otherwise
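One common heuristic for this check is to look for the google.colab package, which Colab injects into the runtime (an assumption for illustration; the library's actual check may differ):

```python
import sys

def on_colab():
    """Return True when running inside Google Colab, False otherwise.
    Colab imports the 'google.colab' package into every runtime."""
    return 'google.colab' in sys.modules
```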

Test the point cloud viz inside a notebook:

import torch

tokens = torch.rand((16,32,152))
show_pca_point_cloud(tokens)  # default, no lines connecting dots

Or we can add lines connecting the dots, such as a faint gray line:

show_pca_point_cloud(tokens, mode='lines+markers')

Spectrograms


source

mel_spectrogram

 mel_spectrogram (waveform, power=2.0, sample_rate=48000, db=False,
                  n_fft=1024, n_mels=128, debug=False)

calculates data array for mel spectrogram (in however many channels)
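The library presumably builds this on torchaudio, but the computation itself is just a windowed FFT power spectrum passed through a triangular mel filterbank. A self-contained numpy sketch under the same defaults (note that with sample_rate=48000, n_fft=1024, and n_mels=128, the lowest mel filters can end up all zero, which is exactly the torchaudio warning shown later in this page):

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_power_spectrogram(waveform, sample_rate=48000, power=2.0,
                          n_fft=1024, n_mels=128):
    """Mono mel power spectrogram: windowed FFT power, then mel filterbank.
    Returns an array of shape (n_frames, n_mels)."""
    hop = n_fft // 4
    window = np.hanning(n_fft)
    n_frames = 1 + (len(waveform) - n_fft) // hop
    frames = np.stack([waveform[i*hop : i*hop + n_fft] * window
                       for i in range(n_frames)])
    spec = np.abs(np.fft.rfft(frames, axis=1)) ** power   # (n_frames, n_fft//2+1)
    # triangular filters spaced evenly on the mel scale
    mels = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / sample_rate).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        l, c, r = bins[m - 1], bins[m], bins[m + 1]
        if c > l:
            fb[m - 1, l:c] = (np.arange(l, c) - l) / (c - l)
        if r > c:
            fb[m - 1, c:r] = (r - np.arange(c, r)) / (r - c)
    return spec @ fb.T
```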


source

spectrogram_image

 spectrogram_image (spec, title=None, ylabel='freq_bin', aspect='auto',
                    xmax=None, db_range=[35, 120], justimage=False)

Modified from PyTorch tutorial https://pytorch.org/tutorials/beginner/audio_feature_extractions_tutorial.html


source

audio_spectrogram_image

 audio_spectrogram_image (waveform, power=2.0, sample_rate=48000,
                          print=<built-in function print>, db=False,
                          db_range=[35, 120], justimage=False, log=False)

Wrapper that calls the above two routines at once, using the Mel scale. Modified from the PyTorch tutorial https://pytorch.org/tutorials/beginner/audio_feature_extractions_tutorial.html

Let’s test the above routine:

spec_graph = audio_spectrogram_image(waveform, justimage=False, db=False, db_range=[-60,20])
display(spec_graph)
/Users/shawley/opt/anaconda3/envs/shazbot/lib/python3.8/site-packages/torchaudio/functional/functional.py:571: UserWarning:

At least one mel filterbank has all zero values. The value for `n_mels` (128) may be set too high. Or, the value for `n_freqs` (513) may be set too low.

‘Playable Spectrograms’

Source(s): Original code by Scott Condron (@scottire) of Weights and Biases, edited by @drscotthawley

cf. @scottire’s original code here: https://gist.github.com/scottire/a8e5b74efca37945c0f1b0670761d568

and Morgan McGuire’s edit here: https://github.com/morganmcg1/wandb_spectrogram


source

playable_spectrogram

 playable_spectrogram (waveform, sample_rate=48000, specs:str='all',
                       layout:str='row', height=170, width=400,
                       cmap='viridis', output_type='wandb', debug=True)

Takes a tensor input and returns a [wandb.]HTML object with spectrograms of the audio. Options for specs:

- “all”: all plots
- “all_specs”: spectrograms only
- “melspec”: melspectrogram only
- “spec”: spectrogram only
- “wave_mel”: waveform and melspectrogram only
- “waveform”: waveform only, equivalent to a wandb.Audio object

Limitations: spectrograms show channel 0 only (i.e., mono)

|  | Type | Default | Details |
|--|------|---------|---------|
| waveform |  |  | audio, PyTorch tensor |
| sample_rate | int | 48000 | sample rate in Hz |
| specs | str | all | see the docstring above |
| layout | str | row | 'row' or 'grid' |
| height | int | 170 | height of spectrogram image |
| width | int | 400 | width of spectrogram image |
| cmap | str | viridis | colormap string for Holoviews, see https://holoviews.org/user_guide/Colormaps.html |
| output_type | str | wandb | 'wandb', 'html_file', or 'live'; use 'live' for notebooks |
| debug | bool | True | flag for internal print statements |

source

generate_melspec

 generate_melspec (audio_data, sample_rate=48000, power=2.0, n_fft=1024,
                   win_length=None, hop_length=None, n_mels=128)

helper routine for playable_spectrogram

Sample usage with WandB:

wandb.init(project='audio_test')
wandb.log({"playable_spectrograms": playable_spectrogram(waveform)})
wandb.finish()

See example result at https://wandb.ai/drscotthawley/playable_spectrogram_test/

Test the playable spectrogram:

HTML(playable_spectrogram(waveform, output_type='html_file'))

Let’s show off the multichannel waveform display:

mc_wave = load_audio('examples/stereo_pewpew.mp3')
playable_spectrogram(mc_wave, specs='wave_mel', output_type='live')
Resampling examples/stereo_pewpew.mp3 from 22050 Hz to 48000 Hz
playable_spectrogram(waveform, specs="waveform", output_type='live')

source

tokens_spectrogram_image

 tokens_spectrogram_image (tokens, aspect='auto', title='Embeddings',
                           ylabel='index')

for visualizing embeddings in a spectrogram-like way

tokens_spectrogram_image(tokens)

source

plot_jukebox_embeddings

 plot_jukebox_embeddings (zs, aspect='auto')

makes a plot of jukebox embeddings